AI/ML and Data Infrastructure for Microscopy

Maria K Y Chan1

1Argonne National Laboratory

The combination of high-throughput computational modeling and in situ/operando microscopy and spectroscopy experiments has created significant opportunities to study an ever-increasing number of materials in ever-increasing detail. However, the integration of computational modeling with experimental measurements still often happens in a post hoc and inefficient way. Data science techniques such as machine learning (ML), artificial intelligence (AI), and computer vision may be able to accelerate this integration. In this talk, we will discuss how we use AI/ML in conjunction with theory-based modeling to extract information from microscopy data in an automated fashion. We will describe our FANTASTX (Fully Automated Nanoscale To Atomistic Structures from Theory and eXperiment) [1] and Ingrained [2] codes, which allow such integration, as well as the EXSCLAIM! [3] and Plot2Spectra [4] codes for extracting microscopy and spectroscopy data from the scientific literature. We will discuss the improvements to EXSCLAIM! afforded by recent developments in large language models, as well as how data extracted with EXSCLAIM! are used to aid tagging, data infrastructure, and retrieval of microscopy images. The use of these codes to discover the structure of novel materials [5] and to understand technologically important energy materials [6] will also be discussed.

[1] D. Unruh, V. S. C. Kolluru, A. Baskaran, Y. Chen, M. K. Y. Chan, “Theory+AI/ML for Microscopy and Spectroscopy – Challenges and Opportunities,” MRS Bulletin 47, 1024–1035 (2022).

[2] E. Schwenker, V. S. C. Kolluru, J. Guo, X. Hu, Q. Li, M. C. Hersam, V. P. Dravid, R. F. Klie, J. R. Guest, M. K. Y. Chan, “Ingrained: an automated framework for fusing atomic-scale image simulations into experiments,” Small 18, 2102960 (2022).

[3] E. Schwenker, W. Jiang, T. Spreadbury, N. Ferrier, O. Cossairt, M. K. Y. Chan, “EXSCLAIM! – Harnessing materials science literature for labeled microscopy datasets,” Patterns 4, 100843 (2023).

[4] W. Jiang, K. Li, T. Spreadbury, E. Schwenker, O. Cossairt, M. K. Y. Chan, “Plot2Spectra: an Automatic Spectra Extraction Tool,” Digital Discovery 1, 719–731 (2022).

[5] Q. Li, V. S. C. Kolluru, M. S. Rahn, E. Schwenker, S. Li, R. G. Hennig, P. Darancet, M. K. Y. Chan, M. C. Hersam, “Synthesis of borophane polymorphs through hydrogenation of borophene,” Science 371, 1143–1148 (2021).

[6] X. Liu, G.-L. Xu, V. S. C. Kolluru, et al., “Origin and regulation of oxygen redox instability in high-voltage battery cathodes,” Nature Energy 7, 808–817 (2022).